natural_language_understandingfandomcom-20200214-history
Dependency parsing
Two main streams: * Transition-based ** Non-monotonic ~: http://www.aclweb.org/anthology/D/D15/D15-1162.pdf ** An important reference: Goldberg & Nirve (2013) they come close to reinforcement learning. TODO: Vaswani & Sagae (2016)Vaswani, A., & Sagae, K. (2016). Efficient Structured Inference for Transition-Based Parsing with Neural Networks and Error States. Transactions of the Association for Computational Linguistics, 4, 183–196. use a special "error state" to recognize erroneous derivations. * Graph-based Transition-based parsing Transition systems * Arc-standard * Arc-eager * Arc-hybrid (Kuhlmann et al., 2011)Marco Kuhlmann, Carlos G´ omez-Rodrıguez, and Giorgio Satta. 2011. Dynamic programming algorithms for transition-based dependency parsers. In Proceed- ings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), pages 673–682. * Swap-standard * Extended arc-standard (Attardi, 2006)Attardi, G. (2006). Experiments with a Multilanguage Non-Projective Dependency Parser. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X) (pp. 166–170). New York City: Association for Computational Linguistics. Retrieved from http://www.aclweb.org/anthology/W06-2922 Feature selection * Traditional feature templates (TODO) * Primitive, embedded features: Chen and Manning (2014)Chen, D., & Manning, C. (2014). A Fast and Accurate Dependency Parser using Neural Networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 740–750). Doha, Qatar: Association for Computational Linguistics. * Stack LSTM: Dyer et al. (2015)Dyer, C., Ballesteros, M., Ling, W., Matthews, A., & Smith, N. A. (2015). Transition-Based Dependency Parsing with Stack Long Short-Term Memory. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 334–343). Retrieved from http://www.aclweb.org/anthology/P15-1033 * Bi-directional LSTMs: Kiperwasser and Goldberg (2016)Kiperwasser, E., & Goldberg, Y. (2016). Simple and Accurate Dependency Parsing Using Bidirectional {LSTM} Feature Representations. CoRR, abs/1603.0. List-based (non-directional) systems * Easy-first (Goldberg and Elhadad, 2010)Yoav Goldberg and Michael Elhadad. 2010. An effi- cient algorithm for easy-first non-directional dependency parsing. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguis- tics (NAACL HLT), pages 742–750. * Easy-first with Hierarchical Tree LSTMs (Kiperwasser & Goldberg, 2016)Kiperwasser, E., & Goldberg, Y. (2016). Easy-First Dependency Parsing with Hierarchical Tree LSTMs. Arxiv. Retrieved from http://arxiv.org/abs/1603.00375 * Dependency parsing as head selection, using bi-directional LSTMs (Zhang et al. 2016)Zhang, X., Cheng, J., & Lapata, M. (2016). Dependency Parsing as Head Selection. Retrieved from http://arxiv.org/abs/1606.01280 Graph-based systems TODO: Yu and Bohnet (2016) PDF Classifier From Yazdani & Henderson (2015)Yazdani, Majid, and James Henderson. "Incremental Recurrent Neural Network Dependency Parser with Search-based Discriminative Training." (2015): 142-152.: "To train a classifier to choose the best actions, previous work has proposed memory-based classifiers (Nivre et al., 2004), SVMs (Nivre et al., 2006), structured perceptron (Huang et al., 2012; Zhang and Clark, 2011), two-layer neural networks (Chen and Manning, 2014), and Incremental Sigmoid Belief Networks (ISBN) (Titov and Henderson, 2007b), amongst other approaches." Yazdani & Henderson (2015): recurrent neural network. Corpora Papers mainly use WSJ of Penn Treebank (TODO: confirm this). Although there is more in the treebank, only WSJ has been patched with gold NP-bracketing (Vadas & Curran, 2007)Vadas, D., & Curran, J. R. (2007). Adding noun phrase structure to the Penn Treebank. 45th Annual Meeting of the Association of Computational Linguistics, (June), 240–247. Retrieved from http://acl.ldc.upenn.edu/P/P07/P07-1031.pdf. Evaluation * Label Attachment Score (LAS): % of tokens for which a system has predicted the correct HEAD and DEPREL * Unlabeled A5achment Score (UAS): % of tokens with correct HEAD • * Label Accuracy (LA): % of tokens with correct DEPREL For in-depth discussion: Volokh (2013)Volokh, Alexander. "Performance-oriented dependency parsing." (2013). PDF Open problems Ng and Curran (2015)Ng, D., & Curran, J. R. (2015). Identifying Cascading Errors using Constraints in Dependency Parsing. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (pp. 1148–1158). Beijing: ACL.: "Our results showthat noun phrases remain challenging for dependency parsers, both in choosing the correct head, and in determining the internal structure. Punctuation, despite being commonly ignored in parsers and evaluation, causes substantial cascading errors when misattached." McDonald and Nivre (2007)McDonald, R., & Nivre, J. (2007). Characterizing the Errors of Data-Driven Dependency Parsing Models. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL).: "Error propagation is an issue for MaltParser, which typically performs worse on long sentences, long dependency arcs and arcs higher in the graphs". This is true for gready transition-based parsers in general. Beam search can alleviate the problem but doesn't solve it. Global inference (Andor et al., 2016)Andor, D., Alberti, C., Weiss, D., Severyn, A., Presta, A., Ganchev, K., Petrove, S., Collins, M. (2016). Globally Normalized Transition-Based Neural Networks. arXiv.org, cs.CL. might help but empirical verification is needed. References Category:Dependency parsing